Construction of a Chinese-English Verb Lexicon for Embedded Machine Translation in Cross-Language Information Retrieval

Authors

  • Bonnie Jean Dorr
  • Dekang Lin
Abstract

This paper addresses the problem of automatic acquisition of lexical knowledge for rapid construction of MT engines for multilingual applications. We describe new techniques for large-scale construction of a Chinese-English verb lexicon, and we evaluate the coverage and effectiveness of the resulting lexicon for a structured MT approach that is embedded in a cross-language information retrieval system. Leveraging off an existing Chinese conceptual database called HowNet and a large, semantically rich English verb database, we use thematic-role information to create links between Chinese concepts and English classes. We apply the metrics of recall and precision to evaluate the coverage and effectiveness of the linguistic resources. The results of this work indicate that: (1) we are able to obtain reliable Chinese-English entries both with and without pre-existing semantic links between the two languages; (2) if we have pre-existing semantic links, we are able to produce a more robust lexical resource by merging these with our semantically rich English database; (3) in our comparisons with manual lexicon creation, our automatic techniques were shown to achieve 62% precision, compared to a much lower precision of 10% for arbitrary assignment of semantic links.
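
As a rough illustration of the evaluation metrics named in the abstract, the sketch below computes precision and recall for a set of automatically proposed concept-to-class links against a manually built reference set. The function name, the pair representation, and the toy data are all hypothetical; the abstract does not specify how the authors implemented their scoring.

```python
def precision_recall(proposed, gold):
    """Return (precision, recall) of proposed links against a gold set."""
    proposed, gold = set(proposed), set(gold)
    correct = proposed & gold  # links found in both sets
    precision = len(correct) / len(proposed) if proposed else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy data: (Chinese concept, English verb class) pairs.
# These identifiers are placeholders, not entries from HowNet or the paper.
gold_links = {
    ("concept_a", "class_1"),
    ("concept_a", "class_2"),
    ("concept_b", "class_3"),
    ("concept_c", "class_4"),
}
auto_links = {
    ("concept_a", "class_1"),
    ("concept_b", "class_3"),
    ("concept_c", "class_9"),
}

p, r = precision_recall(auto_links, gold_links)
print(f"precision={p:.2f} recall={r:.2f}")
```

Here precision measures how many of the proposed links are correct, while recall measures how much of the reference set is covered, which matches the abstract's pairing of "effectiveness" and "coverage" only loosely.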

Similar resources

Building a Chinese-English Mapping between Verb Concepts for Multilingual Applications

This paper addresses the problem of building conceptual resources for multilingual applications. We describe new techniques for large-scale construction of a Chinese-English lexicon for verbs, using thematic-role information to create links between Chinese and English conceptual information. We then present an approach to compensating for gaps in the existing resources. The resulting lexicon is...

Chinese-English Semantic Resource Construction

We describe an approach to large-scale construction of a semantic lexicon for Chinese verbs. We leverage off of three existing resources: a classification of English verbs called EVCA (English Verbs Classes and Alternations) (Levin, 1993), a Chinese conceptual database called HowNet (Zhendong, 1988c; Zhendong, 1988b; Zhendong, 1988a) (http://www.how-net.com), and a large machine-readable dictio...

Construction of Chinese-English Semantic Hierarchy for Information Retrieval

This paper describes an approach to large-scale construction of a semantic hierarchy for Chinese verbs. Leveraging off of an existing Chinese conceptual database called HowNet and a Levin-based English verb classification, we use thematic-role information to create links between Chinese concepts and English classes. The resulting hierarchy is used for multilingual lexicons in an English-Chinese c...

Comparing Multiple Methods for Japanese and Japanese-English Text Retrieval

The NACSIS collection of Japanese scientific documents (with English titles) provides a solid foundation for information retrieval research into 1) segmentation methods for Japanese text, 2) effective methods for monolingual Japanese retrieval, and 3) Japanese-English cross-language retrieval. This paper compares multiple methods for Japanese and Japanese-English text retrieval. Our focus is on ac...

Evaluating Resources for Query Translation in Cross-Language Information Retrieval

Our goal is to evaluate the utility of a lexical resource containing Lexical Conceptual Structures (LCS) for use in cross-language information retrieval. Our evaluation makes use of a combination of techniques from interlingual machine translation (Dorr) with conventional information retrieval techniques (Oard; Oard and Dorr). Given a query in one language, we transform the query into the corresponding ter...

Publication date: 2002